Semi-Stochastic Coordinate Descent
Authors
Abstract
We propose a novel stochastic gradient method, semi-stochastic coordinate descent (S2CD), for the problem of minimizing a strongly convex function represented as the average of a large number of smooth convex functions: f(x) = (1/n) ∑_i f_i(x). Our method first performs a deterministic step (computation of the gradient of f at the starting point), followed by a large number of stochastic steps. The process is repeated a few times, with the last stochastic iterate becoming the new starting point where the deterministic step is taken. The novelty of our method is in how the stochastic steps are performed. In each such step, we pick a random function f_i and a random coordinate j, both using nonuniform distributions, and update a single coordinate of the decision vector only, based on the computation of the j-th partial derivative of f_i at two different points. Each random step of the method constitutes an unbiased estimate of the gradient of f and, moreover, the squared norm of the steps goes to zero in expectation, meaning that the stochastic estimate of the gradient progressively improves. The complexity of the method is the sum of two terms: O(n log(1/ε)) evaluations of gradients ∇f_i and O(κ̂ log(1/ε)) evaluations of partial derivatives ∇_j f_i, where κ̂ is a novel condition number.

JK acknowledges support from Google through the Google European Doctoral Fellowship in Optimization Algorithms. ZQ and PR would like to acknowledge support from the EPSRC Grant EP/K02325X/1, Accelerated Coordinate Descent Methods for Big Data Optimization. A short version of this paper (5 pages, including the main result but without proof) was posted on arXiv on October 16, 2014 [7]. The paper was accepted for presentation at the 2014 NIPS Optimization for Machine Learning workshop in a peer-reviewed process. The accepted papers are listed on the website of the workshop but are not published in any proceedings volume.
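To make the outer/inner loop structure above concrete, the following Python sketch runs it on a toy ridge-regression objective. All names (s2cd_sketch, partial_fi, grad_f) are illustrative, and uniform sampling of i and j with a fixed step size is assumed purely for brevity; the actual method uses nonuniform distributions and step sizes derived from coordinate-wise smoothness constants, so this is a sketch of the idea rather than the algorithm from the paper.

```python
import numpy as np

def s2cd_sketch(partial_fi, grad_f, x0, n, d, epochs=30, m=None, h=1e-2, rng=None):
    """Sketch of a semi-stochastic coordinate descent loop (illustrative only).

    partial_fi(i, j, x) -> j-th partial derivative of f_i at x
    grad_f(x)           -> full gradient of f(x) = (1/n) * sum_i f_i(x)
    Uniform sampling and a fixed step size are simplifying assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = 2 * n if m is None else m                  # stochastic steps per outer loop
    y = np.array(x0, dtype=float)
    for _ in range(epochs):
        g_full = grad_f(y)                         # deterministic step: full gradient at y
        x = y.copy()
        for _ in range(m):
            i = rng.integers(n)                    # random function f_i
            j = rng.integers(d)                    # random coordinate j
            # j-th partial of f_i evaluated at two points; with uniform sampling,
            # E[d * g_j * e_j] = grad f(x), so each step is an unbiased estimate
            g_j = g_full[j] + partial_fi(i, j, x) - partial_fi(i, j, y)
            x[j] -= h * d * g_j                    # update a single coordinate only
        y = x                                      # last iterate becomes the new start
    return y

# Toy usage: ridge-regularized least squares (data and names are illustrative)
rng = np.random.default_rng(0)
n, d, lam = 200, 10, 0.1
A, b = rng.standard_normal((n, d)), rng.standard_normal(n)
partial_fi = lambda i, j, x: (A[i] @ x - b[i]) * A[i, j] + lam * x[j]
grad_f = lambda x: A.T @ (A @ x - b) / n + lam * x
x_hat = s2cd_sketch(partial_fi, grad_f, np.zeros(d), n, d)
```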
Similar articles
S2CD: Semi-Stochastic Coordinate Descent
We propose a novel reduced-variance method, semi-stochastic coordinate descent (S2CD), for the problem of minimizing a strongly convex function represented as the average of a large number of smooth convex functions: f(x) = (1/n) ∑_i f_i(x). Our method first performs a deterministic step (computation of the gradient of f at the starting point), followed by a large number of stochastic steps. The pro...
Semi-Stochastic Coordinate Descent (arXiv:1412.6293v1 [cs.NA], 19 Dec 2014)
We propose a novel stochastic gradient method, semi-stochastic coordinate descent (S2CD), for the problem of minimizing a strongly convex function represented as the average of a large number of smooth convex functions: f(x) = (1/n) ∑_i f_i(x). Our method first performs a deterministic step (computation of the gradient of f at the starting point), followed by a large number of stochastic steps. The...
Trading Computation for Communication: Distributed Stochastic Dual Coordinate Ascent
We present and study a distributed optimization algorithm based on a stochastic dual coordinate ascent method. Stochastic dual coordinate ascent methods enjoy strong theoretical guarantees and often perform better than stochastic gradient descent methods in optimizing regularized loss minimization problems, yet they have received little study in a distributed framework. We ...
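For context, the snippet below is a plain single-machine stochastic dual coordinate ascent loop for ridge-regularized squared loss; it is a generic textbook-style illustration, not the distributed algorithm of that paper, and the communication/computation trade-off it studies is omitted entirely. The function name and interface are hypothetical.

```python
import numpy as np

def sdca_ridge_sketch(A, b, lam, epochs=20, rng=None):
    """Single-machine SDCA sketch for ridge-regularized squared loss.

    Primal: min_w (1/n) * sum_i 0.5*(a_i^T w - b_i)^2 + (lam/2)*||w||^2,
    with the primal iterate maintained as w = (1/(lam*n)) * sum_i alpha_i * a_i.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape
    alpha = np.zeros(n)            # one dual variable per example
    w = np.zeros(d)                # primal vector kept in sync with alpha
    for _ in range(epochs):
        for i in rng.permutation(n):
            # closed-form dual coordinate maximization for the squared loss
            delta = (b[i] - A[i] @ w - alpha[i]) / (1.0 + (A[i] @ A[i]) / (lam * n))
            alpha[i] += delta
            w += delta * A[i] / (lam * n)
    return w
```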
Randomized Block Coordinate Descent for Online and Stochastic Optimization
Two types of low cost-per-iteration gradient descent methods have been extensively studied in parallel. One is online or stochastic gradient descent (OGD/SGD), and the other is randomized coordinate descent (RBCD). In this paper, we combine the two types of methods and propose online randomized block coordinate descent (ORBCD). At each iteration, ORBCD only computes the partial gradie...
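The core idea, one sample and one coordinate block per iteration, can be illustrated with the minimal Python sketch below. It uses single coordinates as blocks, a constant step size, and no proximal or regularization step, so it should be read as a rough illustration of the combination rather than the ORBCD method as analyzed in that paper; all names are hypothetical.

```python
import numpy as np

def orbcd_sketch(partial_fi, x0, n, d, n_iters=10_000, eta=1e-2, rng=None):
    """Rough sketch of an online randomized block coordinate step:
    each iteration touches one sample and one coordinate (block size 1 here).

    partial_fi(i, j, x) -> j-th partial derivative of the i-th loss at x.
    Step-size schedules, larger blocks, and proximal steps are omitted.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        i = rng.integers(n)                   # stochastic/online part: one sample
        j = rng.integers(d)                   # RBCD part: one coordinate block
        x[j] -= eta * partial_fi(i, j, x)     # cheap single-block update
    return x
```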
Fastest Rates for Stochastic Mirror Descent Methods
Relative smoothness, a notion introduced in [6] and recently rediscovered in [3, 18], generalizes the standard notion of smoothness typically used in the analysis of gradient-type methods. In this work we take ideas from the well-studied field of stochastic convex optimization and use them to obtain faster algorithms for minimizing relatively smooth functions. We propose and analyze ...
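As background, the snippet below shows one standard instance of stochastic mirror descent: exponentiated gradient on the probability simplex, obtained from the negative-entropy mirror map. It only illustrates the mirror descent template that relative smoothness generalizes; it is not the relatively smooth method proposed in that paper, and the interface is hypothetical.

```python
import numpy as np

def smd_simplex_sketch(stoch_grad, x0, n_iters=1_000, eta=0.1, rng=None):
    """Stochastic mirror descent on the probability simplex with the
    negative-entropy mirror map (exponentiated gradient); a generic
    illustration of the mirror descent template only.

    stoch_grad(x, rng) -> an unbiased stochastic gradient of the objective at x.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(n_iters):
        g = stoch_grad(x, rng)
        x = x * np.exp(-eta * g)   # mirror step: argmin_y <g, y> + (1/eta) * KL(y, x)
        x /= x.sum()               # project back onto the simplex (re-normalize)
    return x
```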
Journal: Optimization Methods and Software
Volume 32, Issue -
Pages -
Year of publication: 2017